Generating Better Decision Trees
Abstract
A new decision tree learning algorithm called IDX is described. More general than existing algorithms, IDX addresses issues of decision tree quality largely overlooked in the artificial intelligence and machine learning literature. Decision tree size, error rate, and expected classification cost are just a few of the quality measures it can exploit. Furthermore, decision trees of varying quality can be induced simply by adjusting the complexity of the algorithm. Quality should be addressed during decision tree construction since retrospective pruning of the tree, or of a derived rule set, may be unable to compensate for inferior splitting decisions. The complexity of the algorithm, the basis for the heuristic it embodies, and the results of three different sets of experiments are described.
Similar resources
Effect of Pruning and Early Stopping on Performance of a Boosting Ensemble
Generating an architecture for an ensemble of boosting machines involves making a series of design decisions. One design decision is whether to use simple “weak learners” such as decision tree stumps or more complicated weak learners such as large decision trees or neural networks. Another design decision is the training algorithm for the constituent weak learners. Here we concentrate on binary...
Predicting The Type of Malaria Using Classification and Regression Decision Trees
Maryam Ashoori (1, *), Fatemeh Hamzavi (2). (1) School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran; (2) School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran. Abstract. Background: Malaria is an infectious disease infecting 200-300 million people annually. Environme...
Generating Accurate Rule Sets Without Global Optimization
The two dominant schemes for rule-learning, C4.5 and RIPPER, both operate in two stages. First they induce an initial rule set and then they refine it using a rather complex optimization stage that discards (C4.5) or adjusts (RIPPER) individual rules to make them work better together. In contrast, this paper shows how good rule sets can be learned one rule at a time, without any need for global...
Generating Rule-Based Trees from Decision Trees for Concept-based Information Retrieval
Web-based information retrieval systems may result in poor levels of precision and recall when users are required to articulate their own queries. Concept-based information retrieval attempts to solve this problem by allowing users to select from concept definitions specified by experts. However, it is unrealistic to expect experts to define every concept which will be of interest to users. The...
Bagging, Boosting, and C4.5
Breiman’s bagging and Freund and Schapire’s boosting are recent methods for improving the predictive power of classifier learning systems. Both form a set of classifiers that are combined by voting, bagging by generating replicated bootstrap samples of the data, and boosting by adjusting the weights of training instances. This paper reports results of applying both techniques to a system that l...